Over the last few decades, many national and international brands have emerged in Pakistan. Entrepreneurs use brands as the principal factor of differentiation to gain competitive advantage over rivals, and branding plays an imperative role in the success of companies [1].
Clothes today are made from a wide range of materials. Traditional materials such as cotton, linen and leather are still sourced from plants and animals [2].
[1] Kamran, A., Dawood, M. U., Rafi, S. K., Butt, F. M., & Akhtar, K. (2020). Impact of Brand Name on Purchase Intention: A Study on Clothing in Karachi, Pakistan. International Journal of Innovation, Creativity and Change, 278-293.
[2] Common Objective. (2021, December 10). What Are Our Clothes Made From? Retrieved from https://www.commonobjective.co/article/what-are-our-clothes-made-from
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
from sklearn.impute import SimpleImputer

df1 = pd.read_excel('/Users/snawaz/Documents/pychilla2/teamproject_sep3/Deep_note_linked/Sales.xlsx')
df = df1.copy()
rows, cols = df.shape
print("Number of rows in the dataset is", rows)## Number of rows in the dataset is 670082
print("Number of columns in the dataset is", cols)## Number of columns in the dataset is 45
df.info()## <class 'pandas.core.frame.DataFrame'>
## RangeIndex: 670082 entries, 0 to 670081
## Data columns (total 45 columns):
## # Column Non-Null Count Dtype
## --- ------ -------------- -----
## 0 BillNo 670082 non-null object
## 1 BillDate 670082 non-null datetime64[ns]
## 2 LoyaltyCard 33720 non-null object
## 3 Customer 661558 non-null object
## 4 Description 66014 non-null object
## 5 BillMonth 670082 non-null object
## 6 Warehouse 670082 non-null object
## 7 RegionName 670082 non-null object
## 8 Location 670082 non-null object
## 9 Category 670082 non-null object
## 10 DepartmentName 670082 non-null object
## 11 BrandName 668530 non-null object
## 12 CoBrand 670082 non-null object
## 13 Barcode 670082 non-null int64
## 14 DesignNo 670082 non-null object
## 15 Rejection 670082 non-null object
## 16 SeasonName 670082 non-null object
## 17 Attribute1 670057 non-null object
## 18 Attribute2 143125 non-null object
## 19 Attribute3 309262 non-null object
## 20 Attribute4 670082 non-null object
## 21 Attribute5 337322 non-null object
## 22 Attribute6 0 non-null float64
## 23 Attribute7 0 non-null float64
## 24 Attribute8 490234 non-null object
## 25 LocalImport 670082 non-null object
## 26 Color 670082 non-null object
## 27 Sizes 670082 non-null object
## 28 DiscountType 502886 non-null object
## 29 SalesmanName 670082 non-null object
## 30 Qty 670082 non-null int64
## 31 SalesReturnReason 14786 non-null object
## 32 Price 670082 non-null int64
## 33 Amount 670082 non-null int64
## 34 SaleExclGST 670082 non-null float64
## 35 GSTP 670082 non-null int64
## 36 GST 670082 non-null int64
## 37 DiscPer 670082 non-null float64
## 38 DiscAmount 670082 non-null float64
## 39 BarcodeDiscPer 670082 non-null int64
## 40 BarcodeDiscount 670082 non-null int64
## 41 NetAmount 670082 non-null float64
## 42 PointsEarned 670082 non-null int64
## 43 TaxPer 670082 non-null int64
## 44 Cobrand Acc 670082 non-null object
## dtypes: datetime64[ns](1), float64(6), int64(10), object(28)
## memory usage: 230.1+ MB
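The `df.info()` output above shows heavy missingness in several columns (e.g. LoyaltyCard has only 33,720 non-null entries out of 670,082, and Attribute6/Attribute7 are entirely empty), so a per-column missing-value summary is a useful next step before deciding what to drop or impute. A minimal sketch of that check, using a small toy DataFrame in place of the real `df` (which is not reproduced here):

```python
import pandas as pd
import numpy as np

# Toy frame standing in for the real Sales data; column names mirror the
# dataset above, but the values are illustrative only
df = pd.DataFrame({
    "LoyaltyCard": [np.nan, "12", np.nan, np.nan],
    "Customer": ["Mr.Adil", None, "Mrs Imran", "."],
    "Qty": [1, -1, 2, 3],
})

# Fraction of missing values per column, sorted worst-first
null_share = df.isna().mean().sort_values(ascending=False)
print(null_share)
```

Columns with a very high missing share (such as Attribute6 and Attribute7, which are 100% null) are candidates for dropping outright, while moderately sparse columns are candidates for the imported `SimpleImputer`.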
DT::datatable(head(py$df, 20), options = list(pageLength = 5, scrollX = T))

for column in df.columns:
    print('Number of unique data for {0} is {1}'.format(column, len(df[column].unique())))
    print('unique data for {0} is {1}'.format(column, df[column].unique()))
    print('=====================================')## Number of unique data for BillNo is 439665
## unique data for BillNo is ['SALM-010116-00003' 'SALM-010116-00009' 'SALM-010116-00011' ...
## 'SDMT-241218-00112' 'SDMT-271218-00017' 'SDMT-271218-00018']
## =====================================
## Number of unique data for BillDate is 1089
## unique data for BillDate is ['2016-01-01T00:00:00.000000000' '2016-02-01T00:00:00.000000000'
## '2016-03-01T00:00:00.000000000' ... '2018-12-23T00:00:00.000000000'
## '2018-12-25T00:00:00.000000000' '2018-12-30T00:00:00.000000000']
## =====================================
## Number of unique data for LoyaltyCard is 5344
## unique data for LoyaltyCard is [nan 12 26 ... 32356 32370 32433]
## =====================================
## Number of unique data for Customer is 73677
## unique data for Customer is ['.' 'Mr.Shoaib sadiique' 'Mr.Adil' ... ' M HAYAT KHAN ' ' MRS IMRAN '
## ' AIMEN IBRAHIM ']
## =====================================
## Number of unique data for Description is 11788
## unique data for Description is [nan 'Exchnage' 'Exchange' ... 'EMP-2261' ' I.D 2261' 'Blazer']
## =====================================
## Number of unique data for BillMonth is 36
## unique data for BillMonth is ['2016-01' '2016-02' '2016-03' '2016-04' '2016-05' '2016-06' '2016-07'
## '2016-08' '2016-09' '2016-10' '2016-11' '2016-12' '2017-03' '2017-01'
## '2017-02' '2017-05' '2017-04' '2017-06' '2017-07' '2017-08' '2017-09'
## '2017-11' '2017-12' '2017-10' '2018-02' '2018-01' '2018-08' '2018-06'
## '2018-07' '2018-04' '2018-03' '2018-05' '2018-11' '2018-10' '2018-09'
## '2018-12']
## =====================================
## Number of unique data for Warehouse is 2
## unique data for Warehouse is ['No' 'Yes']
## =====================================
## Number of unique data for RegionName is 7
## unique data for RegionName is ['3-NORTH ' '1-KARACHI' '5-WAREHOUSES' '2-LAHORE' '4-CENTRAL PUNJAB '
## '7- EXCLUDE SUB STORES' '6-SUB STORE ']
## =====================================
## Number of unique data for Location is 67
## unique data for Location is ['ALAM STORE' 'AL SAEED SUPER STORE ' 'TCS Atrium Mall'
## 'SANA ENTERPRISES ' 'TCS BEVERLY CENTER'
## 'TCS CLASSIC DEPARTMENTAL STORE (CLOSED)'
## 'CG Main Store / A-5 2nd floor (CLOSED)' 'TCS DOLMEN CITY '
## 'TCS Dolmen Mall -Hyderi' 'TCS Dolmen Mall - Tariq Road'
## 'ENEM ENTERPRISES (CLOSED)' 'EXHIBITION ' 'TCS Fountain-Avenue-Lhr'
## 'TCS The Forum ' 'TCS FORTRESS SQUARE ' 'GALAXY PLUS'
## 'TCS HYDERI BLOCK-H ' 'HKB DEFENCE (CLOSED)' 'HKB LIBERTY '
## 'GULGASHT TOWN - MULTAN' 'HASSAN ENTERPRISES (CLOSED)'
## 'TCS KINGS MALL GUJRANWALA' 'TCS Chen-One-Tower- (CLOSED)'
## 'CAMBRIDGE ONLINE STORE' 'TCS BAHADURABAD ' 'TCS GULSHAN'
## 'TCS Park Tower - 2' 'RAJA SAHIB LINK ROAD' 'RAJA SAHIB LIBERTY'
## 'TCS Sialkot Cantt' 'SANAULLA & CO (CLOSED)' 'SANA VENTURE (CLOSED)'
## 'SANA STYLE ' 'THE SHOPPE-2' 'TCS MODERN SADDAR HYD' 'TCS WADUD SONS'
## 'TCS Z BLOCK DHA' 'TCS RCG MALL - FAISALABAD'
## 'RAJA SAHIB - WAPDA TOWN (CLOSED)' 'TCS SAFA GOLD MALL - G.FLR'
## 'ZEEN ONLINE STORE' 'TCS IDREES BOOK ZEEN- RWP' 'A-5 LOOSE WAREHOUSE '
## 'TCS AMANAH MALL - LAHORE' 'TCS LUCKY ONE - CAMBRIDGE'
## 'TCS PACKAGES MALL - LAHORE' 'TCS GIGA MALL-WTC-Cambridge'
## 'TCS LUCKY ONE - ZEEN' 'GALAXY PLUS 1 (M.IRFAN)'
## 'KORANGI EDHI STORE (PRESS DEPARTMENT)'
## 'TCS NISHAT EMPORIUM - 2 (Cambridge)' 'TCS GIGA MALL-WTC-Zeen'
## 'A-5 LOOSE WAREHOUSE (GROUND FLOOR)' 'B-53 LOOSE WAREHOUSE '
## 'DO BURJ - FSD - ZEEN' 'TCS Zeen Dolmen Mall Hyderi'
## 'TCS ZEEN DOLMEN CITY' 'GOJRA - ZEEN (RAFIQ CENTRE)'
## 'HKB DEFENCE - Y-BLOCK' 'TCS NISHAT EMPORIUM - ZEEN' 'RCG MALL - ZEEN'
## 'SIALKOT (2) - ZEEN' 'TCS SAFA MALL - ZEEN' 'TCS ZEEN ATRIUM MALL '
## 'TCS Y - BLOCK - LAHORE' 'TCS ZEEN DOLMEN TARIQ ROAD'
## 'TCS GUJRANWALA SATELLITE TOWN']
## =====================================
## Number of unique data for Category is 1
## unique data for Category is ['MENS SHIRT ']
## =====================================
## Number of unique data for DepartmentName is 2
## unique data for DepartmentName is ['LICENSE ' 'CAMBRIDGE ']
## =====================================
## Number of unique data for BrandName is 19
## unique data for BrandName is ['LICENSE FULL SLEEVE ' 'LUXER' 'EXECUTIVE ' 'PRINCIPLE SHIRT '
## 'CAMBRIDGE CASUAL ' 'PORT FOLIO SHIRT ' 'CAMBRIDGE Since 1958'
## 'ARISTO SHIRT' 'TOMORROW ' 'LICENSE HALF SLEEVE ' 'DESIGN STUDIO '
## 'CAMBRIDGE HALF SLEEVE ' 'AFTER HOURS ' 'PRIVILEGE SHIRT ' nan
## 'CAMBRIDGE FULL SLEEVE ' 'ACTIVE SHIRT ' 'PERSONALLY CAMBRIDGE'
## 'ZERO TOLERANCE ']
## =====================================
## Number of unique data for CoBrand is 107
## unique data for CoBrand is ['LICENSE F/S ' 'LUXER PLAIN F/S' 'Cambridge Executive F/S'
## 'PRINCIPLE PLAIN F/S ' 'PRINCIPLE CLASSIC F/S ' 'LUXER F/S'
## 'PRINCIPLE SWAN F/S' 'CAMBRIDGE CASUAL F/S' 'PORT FOLIO (SHADES) F/S '
## 'PORT FOLIO F/S WCB ' 'PORT FOLIO SHIRT YARN DYED' 'NO IRON EVER'
## 'DENIM F/S ' 'LUXER PLAIN WCB F/S' 'CAMBRIDGE SINCE 1958' 'ARISTO '
## 'EXECUTIVE H/S ' 'PRINCIPLE CLASSIC H/S' 'ARISTO MASON' 'OVERDYE F/S '
## 'LUXER H/S' 'TOMORROW F/S ' 'LICENSE H/S '
## 'PRINCIPLE POPLIN LUXE MILANO F/S ' 'PRINTED F/S SHIRTS'
## 'PRINCIPLE Y/DYED LUXE MILANO F/S ' 'PORT FOLIO Y/D H/S'
## 'D STUDIO DESIGNERS SHIRT' 'CAMBRIDGE CASUAL H/S ' 'OXFORD H/S'
## 'TOMORROW H/S ' 'AFTER HOURS F/S ' 'CHEMBREY F/S '
## 'PRINCIPLE OXFORD F/S ' 'CHEMBREY H/S' 'AGE OF WISDOM F/S '
## 'PORT FOLIO H/S WCB ' 'PRIVILEGE F/S ' 'SEERSUCKER F/S SHIRT'
## 'LIGHT WEIGHT H/S ' 'LIGHT WEIGHT F/S ' 'PORTFOLIO SATEEN F/S '
## 'DEAD SHIRT F/S ' 'ITALIAN' 'AFTER HOURS H/S '
## 'SEERSUCKER H/S SHIRT' 'LICENSE ' 'OXFORD LICENSE F/S'
## 'HERRING BONE F/S' 'MELANGE YARN DYED F/S ' 'FLANNEL F/S '
## 'SHARP CAMBRDIGE ' 'CAMBRIDGE SHIRT' 'DOBBY F/S' 'Cotton Linen F/S '
## 'ESSENTIALS MENS FORMAL SHIRTS' 'Cotton Slub F/S' 'OXFORD LICENSE H/S'
## 'ACTIVE F/S' 'PERSONALLY CAMBRIDGE' 'D STUDIO F/S '
## 'ZERO TOLERANCE F/S ' 'F/S DOBBY' 'F/S HERRING BONE '
## 'F/S MELANGE YARN DYED ' 'F/S LICENSE '
## 'F/S CAMBRIDGE EXECUTIVE ' 'F/S LUXER PLAIN WCB ' 'F/S LUXER YARN DYED '
## 'F/S ESSENTIALS MENS FORMAL SHIRTS' 'F/S ARISTO ' 'F/S ARISTO MASON'
## 'F/S NO IRON EVER' 'F/S LUXER PLAIN ' 'F/S PRINCIPLE CLASSIC YARN DYED'
## 'F/S DENIM ' 'H/S LICENSE ' 'H/S OXFORD LICENSE ' 'H/S LUXER'
## 'F/S SHARP CAMBRDIGE ' 'H/S EXECUTIVE' 'F/S OVERDYE '
## 'F/S OXFORD LICENSE ' 'F/S PRINCIPLE PLAIN ' 'F/S COTTON LINEN '
## 'H/S PRINCIPLE CLASSIC ' 'F/S PRINCIPLE POPLIN LUXE MILANO'
## 'F/S DEAD SHIRT ' 'F/S PORT FOLIO SHIRT YARN DYED' 'F/S PRINTED SHIRTS'
## 'H/S SEERSUCKER SHIRT' 'F/S PRINCIPLE SWAN ' 'F/S CHEMBREY'
## 'H/S CHEMBREY ' 'F/S CAMBRIDGE SINCE 1958' 'F/S COTTON SLUB '
## 'F/S AFTER HOURS ' 'F/S SEERSUCKER SHIRT' 'F/S FLANNEL'
## 'F/S AGE OF WISDOM ' 'CAMBRIDGE UNIFORM ' 'F/S LICENSE '
## 'F/S LIGHT WEIGHT ' 'F/S TOMORROW ' 'F/S CAMBRIDGE CASUAL'
## 'F/S PRINCIPLE Y/DYED LUXE MILANO ' 'H/S PORT FOLIO Y/D ']
## =====================================
## Number of unique data for Barcode is 25740
## unique data for Barcode is [492146 464028 464945 ... 591320 597473 577000]
## =====================================
## Number of unique data for DesignNo is 5881
## unique data for DesignNo is ['B5393' 'B6335' 'B6336' ... 'B10767' 'B10728' 'BU321']
## =====================================
## Number of unique data for Rejection is 2
## unique data for Rejection is ['No' 'Yes']
## =====================================
## Number of unique data for SeasonName is 31
## unique data for SeasonName is ['WINTER 2015 - 2016' 'SUMMER 2015' 'SUMMER 2016' 'SUMMER 2014'
## 'SUMMER 2011' 'WINTER 2014 - 2015' 'EID FESTIVAL 2015'
## 'EID FESTIVAL 2013' 'WINTER 2013 - 2014' 'WINTER 2010 - 2011'
## 'WINTER 2012 - 2013' 'SUMMER 2013' 'EID FESTIVAL 2014'
## 'WINTER 2011 - 2012 ' 'Opening' 'SUMMER 2012' 'EID FESTIVAL 2012'
## 'EID FESTIVAL 2011' 'SUMMER 2010' 'EID FESTIVAL 2016' 'WINTER 2016-2017'
## 'EID AL ADHA 2016' 'SUMMER 2017' 'EID FESTIVAL 2010' 'WINTER 2017 - 2018'
## 'EID FESTIVAL 2017' 'EID AL ADHA 2017' 'SUMMER 2018' 'EID FESTIVAL 2018'
## 'WINTER 2018 - 2019' 'EID AL ADHA 2018']
## =====================================
## Number of unique data for Attribute1 is 13
## unique data for Attribute1 is ['1 Year & Above (Discounted)' 'No Stock' 'Winter Stock' 'Obsolete'
## 'Rejection' 'B Category' 'Active (Fresh)' 'Cut Range Items '
## 'Summer Sale ' 'Summer Hold Stock ' 'WINTER ACTIVE ' 'WINTER DISCOUNTED '
## nan]
## =====================================
## Number of unique data for Attribute2 is 6
## unique data for Attribute2 is ['Buy 1 Get 1 free' nan 'Regular Fit' 'Slim Fit' 'Comfort fit'
## 'Modern fit']
## =====================================
## Number of unique data for Attribute3 is 4
## unique data for Attribute3 is [nan 'PAKISTAN ' 'IMPORT-S' 'CHINA ']
## =====================================
## Number of unique data for Attribute4 is 1
## unique data for Attribute4 is ['CAMBRIDGE']
## =====================================
## Number of unique data for Attribute5 is 3
## unique data for Attribute5 is [nan 'A' 'B']
## =====================================
## Number of unique data for Attribute6 is 1
## unique data for Attribute6 is [nan]
## =====================================
## Number of unique data for Attribute7 is 1
## unique data for Attribute7 is [nan]
## =====================================
## Number of unique data for Attribute8 is 4
## unique data for Attribute8 is [nan 'Regular' 'Premium' 0]
## =====================================
## Number of unique data for LocalImport is 2
## unique data for LocalImport is ['Local' 'Import']
## =====================================
## Number of unique data for Color is 388
## unique data for Color is ['Forest Teal' 'Pool Blue' 'L/GREEN ' 'BLUE '
## 'MEHROON ' 'MIX ' 'L/BLUE ' 'Ultra Voilet'
## 'PINK/WHITE ' 'DEEP MELON' 'NUGGET' 'YELLOW ' 'BLACK Plaid'
## 'WHITE ' 'SKY BLUE ' 'L/GREY ' 'TURQ '
## 'GREY ' 'BROWN/BLUE ' 'RED ' 'Red/white'
## 'Crystal Blue' 'CREAM ' 'M/BLUE ' 'STONE '
## 'MAROON ' 'PURPLE ' 'D/GREY ' 'ROSE DAWN'
## 'NAVY ' 'L/PINK ' 'ROYAL BLUE ' 'WHITE/BLUE '
## 'BLACK ' 'PURPLE/BLACK ' 'WHITE/RED ' 'BLUE/WHITE '
## 'NAVY STRIP ' 'OFF WHITE ' 'Mushroom ' 'BLUE/NAVY'
## 'PINK ' 'PURPLE STRIPE ' 'GREY/WHITE ' 'BLUE/GREY '
## 'VIOLET ' 'LILAC ' 'Sea Green ' 'WHITE/BROWN '
## 'RED/YELLOW' 'J/BLACK ' 'GREY/PINK ' 'BLUE/BROWN '
## 'FRENCH BLUE' 'BLUE/RED ' 'BLACK/WHITE ' 'INK BLUE '
## 'GOLDEN ' 'GREY2' 'BROWN ' 'BLUE/GREEN '
## 'White/Black ' 'D/BLUE ' 'OCEAN ' 'OLIVE '
## 'NAVY BLUE ' 'Blue/LILAC' 'VAPOR GREY ' 'RED/BLUE '
## 'WHITE/GREY ' 'ORANGE ' 'Multi ' 'African Violet'
## 'WHITE/PINK' 'WHITE/PURPLE ' 'PINK/BLUE' 'Grey/Blue '
## 'BLUE/PURPLE' 'GREEN ' 'RED PLAID' 'Night Shade Blue'
## 'NAVY/RED ' 'KHAKI ' 'Burgundy ' 'BLACK/BLUE '
## 'BLUE/YELLOW ' 'CHARCOAL ' 'RED STRIPE ' 'GREY/GREEN '
## 'BLUE/BLACK ' 'RED WOOD ' 'PEARL PINK ' 'Military Green'
## 'GREEN/BLUE ' 'BANANA GREEN' 'WHITE/NAVY' 'PLAE RED' 'FALL LEAF'
## 'BEIGE ' 'L/PURPLE ' 'WHITE/MEHROON' 'WHITE1 '
## 'PURPLE/WHITE ' 'OPTICAL/WHITE ' 'BLACK/PURPLE ' 'YELLOW/PURPLE'
## 'FAWN ' 'BIKING RED' 'NAVY/WHITE ' 'L/YELLOW '
## 'Grey/Black ' 'BROWN/WHITE ' 'Tea Pink ' 'AQUA '
## 'RIVER BLUE ' 'Wild Ginger' 'RED BUD' 'BLUE/PINK '
## 'FRESH SALMON' 'CREAM1' 'Ultramarine' 'TURKISH TILE' 'D/PURPLE '
## 'BLACK/RED ' 'BEIGE/BLUE ' 'GREY/RED' 'ORANGE/GREY '
## 'PEACH ' 'WHITE/GOLDEN ' 'Carrel' 'ORANGE STRIPE '
## 'Yellow/White' 'WHITE STRIPE ' 'Blue check ' 'Turquoise' 'Red/Navy'
## 'Sky blue/Black' 'Blue Strip ' 'D/GREEN ' 'L/BROWN '
## 'INDIGO ' 'ROSE CLOUD ' 'RUST ' 'D/Mehroon '
## 'Green Stripes ' 'AQUA/WHITE' 'PURPLE/GREY ' 'NAVY/GREEN '
## 'ROYAL ' 'Orange Com' 'LEMON ' 'BLACK/NAVY'
## 'C/BLUE ' 'Oxford tan' 'AQUA BLUE' 'Maroon/Navy' 'Aqua Gray'
## 'RED/BLACK ' 'WHITE/ORANGE ' 'SEA BLUE ' 'R/BLUE '
## 'BLUE/RUST ' 'Black/Grey ' 'WINE' 'NAVY/GREY' 'Palace Blue'
## 'MEHNDI ' 'HIGH RISE' 'SHADOW PURPLE ' 'ECRU OLIVE '
## 'PINK/GREY ' 'BROWN/GREY ' 'NAVY/BEIGE' 'PASTEL BLUE'
## 'BEIGE/WHITE ' 'WHITE/GREEN ' 'Brown/Black ' 'OLIVE/GREY '
## 'SHADOW GREY' 'BLUE/ORANGE ' 'MONUMENT' 'FAWN/GREY '
## 'YELLOW/BLUE ' 'QUICK SILVER ' 'Black/Green' 'C/SEA' 'BLACK/SILVER '
## 'Sky blue/Navy' 'IRON' 'Ballad Blue ' 'N/BLUE ' 'BLACK1 '
## 'WHITE CHECK' 'STAR GAZER' 'SAND ' 'NAVY/INDIGO '
## 'Red Check ' 'RED/GREY' 'OLD NAVY ' 'NAVY/YELLOW'
## 'MISTGREY/BLACK ' 'DEEP WATER BLUE' 'L/BLUE STRIPE' 'LIME GREEN '
## 'GREY/MAROON' 'Grey Check ' 'DUST BLUE' 'NAVY CHECK '
## 'LILAC STRIPE ' 'GREY/NAVY' 'INDIGO RED' 'INDIGO BLUE' 'Dot Navy'
## 'S/BLUE ' 'PLACID BLUE' 'LILAC CHECK ' 'Dot Blue'
## 'D/NAVY ' 'INDIGO YELLOW' 'SNOW WHITE ' 'Dot Red' 'Geo Blue'
## 'Dot Green' 'T/Yellow' 'M/GREY ' 'WALNUT '
## 'STRAIGHT BLUE' 'SILVER PINK' 'GREEN/WHITE ' 'AQUAMARIN '
## 'ALASKAN BLUE ' 'Evening Haze' 'B/RED ' 'NAVY/BROWN '
## 'Vintage Violet' 'Ice Blue' 'SMOKE GREEN' 'Blue Bonnet' 'C/RED '
## 'N-CHOCLATE ' 'NAVY/L.BLUE ' 'Dot Grey' 'Self Blue' 'V-YELLOW'
## 'WHITE/LILAC' 'SKY WHITE ' 'CHATEAU GREY' 'TEAL '
## 'PEARL WHITE' 'CORAL ' 'Black/GOLDEN' 'BRIDAL ROSE'
## 'HARBOUR BLUE' 'METAL ' 'PEACH BLUSH ' 'VANILLA ICE '
## 'Cerulean White' 'DARK BLUE ' 'PURPLE/LILAC' 'RED OXIDE'
## 'Mid Night Blue ' 'FOAM GREEN' 'DRESS BLUE ' 'BLUE/WHITE2'
## 'BLUE/PAISLEY' 'Orchid' 'BLACK/PAISLEY' 'BLACK/FLORAL' 'Lavender'
## 'WHITE/FLORAL' 'NIRVANA' 'D/BROWN ' 'Black Strip '
## 'BROWN/PURPLE ' 'TURKISH COFFEE ' 'Cherry' 'WHITE9'
## 'WHITE3 ' 'WHITE6' 'WHITE7' 'BLUE/KHAKI' 'WHITE8' 'WHITE5'
## 'WHITE2 ' 'WHITE4 ' 'BLUE/WHITE 1 ' 'NAVY/BLUE'
## 'DENIM BLUE' 'MINERAL BLUE' 'FUCHSIA ' 'EGYPHAN BLUE' 'Electric Blue'
## 'CAROLINA BLUE' 'Sky blue/White' 'CORN FLOWER BLUE' 'BLUE/WHITE3' 'BLUE1'
## 'WHITE/L.BLUE' 'BLACK/WHITE DOBBY' 'Aqua sky' 'NAVY/PURPLE '
## 'Rust/White' 'Ocher/White' 'ORANGE/WHITE ' 'RED DOBBY' 'GREY/ROYAL'
## 'Ocher/Grey' 'SKY/RED' 'BLACK DOBBY ' 'ROYAL/SKY' 'PLUM' 'Grey/Brown'
## 'FROST ' 'GLACIER GREY ' 'WHITE12' 'NAVY/LILAC' 'WHITE11'
## 'L/BLUE DOBBY' 'Grey/Purple' 'Sky Dobby' 'DUSTY PINK' 'Navy/Sky Blue'
## 'Blue Berry' 'D/GREY STRIPE' 'BLACK/FEROZI' 'BROWN STRIP '
## 'PURPLE CHECK' 'Pink check' 'ORANGE CHECK' 'WHITE10' 'D.BLUE/WHITE'
## 'GREEN/NAVY ' 'ORANGE/BLUE' 'MINT ' 'O.WHITE/NAVY '
## 'M/DARK GREY ' 'WHITE/SKY' 'NAVY/SILVER ' 'GREEN/LILAC'
## 'WHITE DOTTED ' 'Black dot' 'Red CIRCLE' 'Blue Turqoise' 'ROYAL WHITE'
## 'WHITE/L.BROWN' 'BLUE/OFFWHITE' 'BLACK FLORAL' 'ORANGE/NAVY' 'D/CYAN'
## 'ROYAL BLUE/GREEN' 'GREEN/BLACK ' 'D/PINK' 'D/FAWN ' 'L/NAVY'
## 'C/GREY ' 'PINION' 'Blue Dobby' 'NAVY/OCHR' 'SWIASS DOT'
## 'White/Turquoise' 'WHITE/PEACH' 'NAVY/OFFWHITE ' 'SKY/BROWN'
## 'WHITE/L.PURPLE ' 'D.BROWN/WHITE' 'Grey Strip '
## 'Black/Maroon' 'Blue Stripes ' 'NAVY/PINK' 'Blue dot' 'GREY DOBBY'
## 'PEACH STRIPE' 'GREEN/PURPLE' 'Maroon/White ' 'WHITE/FEROZI '
## 'BEIGE/BROWN ' 'AQUA/YELLOW' 'WHITE PINK ' 'Crystal Pink '
## 'RED/BROWN ' 'S/L BLUE ' 'Cayenne' 'Strom Grey '
## 'Maroon/Grey' 'BROWN/NAVY ' 'SAGE ' 'FAWN/WHITE'
## 'WHITE DOBBY']
## =====================================
## Number of unique data for Sizes is 26
## unique data for Sizes is ['LAR ' 16 '14½ ' '15½ ' '16½ ' 17 'MED ' 'X-LAR' '17½ ' 15 'MEDIM'
## 'SML ' 'LARGE' 'SMALL' 'XLARG' 'XX-LR' 'X-SML' 'XXLRG' '16½' 'MIX' '15¾'
## '15½' '17½' '14½' 'MIX ' '18½ ']
## =====================================
## Number of unique data for DiscountType is 10
## unique data for DiscountType is [nan 'No Discount' 'Director Relatives' 'Group Discount'
## 'Employee Discount' 'Special Promotions' 'Bundle' 'Promotion' 'EPP'
## 'A Suit for Every Occasion']
## =====================================
## Number of unique data for SalesmanName is 796
## unique data for SalesmanName is ['840 WASEER' '836 AMEER ZAIB' '2582 AWAIS' '2558 BABAR' '2581 WAQAR '
## '837 M. HASNAIN ALI' '839 AMJAD ALI' '0086 M HABIB' '2187 SHAFIQ SARWAR'
## '0602 IRFAN AZIZ' '831 M SHOAIB' '2460 BILAL' '751 SAAD ARIF'
## '2339 ARSHAN' '1106 FAIZ AHMED KHAN' '1050 ZUBAIR HUSSAIN' '2677 UZAIR'
## '2625 ALI ' '2604 AMJAD KHAN' '1214 ZEESHAN AHMED' '1945 M HASEEB'
## '826 ZEESHAN' '2540 MALIK ' '824 IMTIAZ AHMED' '822 SHEZAD KHAN'
## '825 RAEES AKBER' '0017 M MAQSOD' '2662 MUSAWIR SHARIF'
## '852 SYED WAQAR HUSSAIN SHAH' '1170 SADDAM HUSSAIN' '851 KAMRAN HUSSAIN'
## '859 ADNAN MUKHTAR' '1949 AZHAR MEHMOOD' '861 M USMAN' '827 SAMIR'
## '1929 RASHID KHAN' '1606 Rauf Khan' '1944 KASHIF QURESHI' '866 KASHIF'
## 'NAVEED JUMMA' 'NASIR SOOMRO' 'FAROOQ YAQOOB' '709 Ali' '745 M UMAIR'
## '749 SADIQ ' '1939 Irfan Khan' '2341 LUBNA' '1593 SYED AHSAN ALI'
## '2395 SAQIB' '743 M AMIR S' '744 WAQAR KHAN' '752 IRFAN' '750 UMAID '
## '2619 Sohail' '748 CYNTHIA WILSON' '784 SARAH' '714 JAMAL'
## '697 SHOAIB KHAN' '700 KHALID' '2002 M FARHAN ALI' '2523 FAHAD '
## '2285 M ILYAS ' '2693 KHALIL ' '696 ABDUL RAZZAQ' '2340 HARIS '
## '2168 DANISH' '2763 ADNAN' '2104 NABEEL AHEMD' '711 FAIZAN'
## '1686 AHMED MIRZA' '1689 DANIAL AKHTAR' '1037 NAVEED KAUSAR'
## '792 WASIF KARIM ' '2347 KHUSHBOO' '2113 M WAQAS' '708 S.M.AKBAR SHAFI'
## '2337 Saniya' '1086 AHSAN ALI' '1085 DANIYAL ' '2206 M SAQIB'
## '712 QURYAT' '1057 WAQAS AHMED' '798 SEHRISH YAQOOB'
## '2702 JANNAT IMTIAZ ' '1036 ZUBAIR SHAHID' '1081 ASAD' '965 QAISER '
## '900 NAVEED ASGHAR' '2350 AMIR' '0068 NOMAN ZAHOOR' '1245 ABDUL SAMAD'
## '0862 M SALMAN' '2534 DANISH' '2647 M SALMAN' '923 SALMAN YOUNAS'
## '738 SHEERAZ AHMED' '1868 ELVIS GEORGE' '740 SHAHID BAIG'
## '732 AURANGZAIB' '814 SHAN ALI' '0067 RANA RAHEEL HAQ' '966 M ANSAR SHAH'
## '905 SHEHROZ ' '968 MAFFIA' '1162 M.ATEEB GHANI' '950 KANWAL '
## '2055 VICKY MAHSI' '927 MAQSOOD AHMED' '0083 M FAIZAN'
## '1000 M.SALAHUDDIN' '999 SHEBAZ RIAZ' '926 M USMAN TANVEER'
## '1678 Abdul Rehman ' '929 SALEEM' '702 SHAHZAIB ' '2401 AREEB'
## '2483 UFAQ' '2409 SALMAN' '2240 MAQSOOD ' '765 M.NAVEED SIDDIQ'
## '791 M HARIS' '2241 BASIT' '2621 AMMAR' '805 RABIA KHAN' '2768 TAHIR'
## '0659 AZYAAT NOOR' '951 ASIF' '2461 NOMAN ' '2633 A BASIT' '2589 MUZAMIL'
## '952 SAJID RAZA' '935 M ADNAN' '2272 SUMAIR' '993 FAISAL BHATTI'
## '908 JUNAID JAMEEL' '2446 ZOHAIB' '2391 ANIQ ' '2660 SHEHZAD AYUB'
## '0220 M SHEHBAZ BARKA' '975 TABISH SALEEM R' '2747 SABIR ' '984 IMRAN'
## '2598 SHAFIQ' '2602 AHSAN' '985 OMAN' '2601 TAHIRA' '2731 ROA SAJAWAL'
## '2599 SHAMEEM' '986 NIDA JAMEEL' '846 WAQAS HUSSAIN' '847 NOMAN GILLANI'
## '848 ZEESHAN' '2058 S AQEEL ' '0062 JAVED IQBAL' '2634 USMAN '
## '0628 MUZAFFAR HUSSAI' '2508 S.RIAZ' '2535 RAZA' '2506 RASHID '
## '976 M IRFAN' '2511 BAKHTAWAR' '962 M IRFAN' '961 FARAN NAZAR '
## '964 SOHAIL ZAFAR' '2398 SHAHRUKH' '1881 ABDUL RAHEEM HANIF'
## '2397 BENISH' '730 JAFFAR AKBER' '2144 KHURRAM ' '1588 AMBREEN'
## '2592 RIDA' '775 MANZOOR AHMED' '753 ARSALAN' '1502 M.TAIMOOR'
## '1993 M RIZWAN ' '1103 HASSAN' '1123 HASAN' '2630 IMRAN' '2402 SHAHYER'
## '2280 FAHAD RAFEEQ' '1556 AMMAR SALEEM' '2610 ZOYA' '2768 ANAS '
## '2754 WAQAS' '1585 MASOOR UL HAQ' '1148 BABER IRFAN' '2433 AQEEL'
## '914 SHARJEEL SAEED' '994 JAMSHAID AHMED' '918 ASAD' '1012 BILAL'
## '1011 UMER BUTT' '0010 GHULAM ALI AKBA' '2273 SHAMIM' '0026 M NADEE'
## '2388 Nayla ' '2425 RABIA' '936 Muzahir ' '1664 ASAD YOUNUS' '2435 ROZI'
## '2322 Adnan Maqbool' '0944 IMRAN' '1103 KASHIF UR REHM' '2520 Shahrukh'
## '2759 QAVI' '1764 M NASIM BAIG' '875 ASHFAQ HUSSAIN' '2549 M ABDUL'
## '2640 NOMAN ' '2654 WASEEM BOOTA' '860 M WASEEM' '2348 MAZHEAR'
## '2670 BABAR SHEHZAD' '2668 SAEED' '2648 ARSALAN FIDA'
## '865 SYED AHMED SHAH' '2671 AMMAR' '864 SADDIQUE AKBAR'
## '863 SHAHZADA A R' '862 HASAN RAZA' '2667 WASEEM ' '1946 FARIS JAN'
## '902 WASIM FAROOQI' '2448 BILAL KHALID ' '1015 AZAM PERWAIZ'
## '2590 FAISAL' '0266 JUNAID AHMED' '1596 FAHAD MUGHAL' '880 QASIM'
## '882 SAJID' '2656 SYED IRFAN ALI' '164 IMRAN ALI' '2386 M Shehzad'
## '0723 SYED ALI HASSAN' '2536 BILAL ZIA' '2642 AWAIS ABBAS' '2389 RASHIDA'
## '1018 MARIA ' '2744 KHALID ' '2773 WAQAS' '2829 MOIZ' '2835 GHAZANFER '
## '1090 MUDASSIR ' '1099 FAIZAN' '2890 TALHA' '2271 SADDAM RASHEED'
## '854 MARIA' 'JAVED QAMAR' '1091 SUMBAL' '2846 MUZAFFAR' '2810 HAMZA'
## '1096 ABID' '754 MEHREEN' '1097 AHSAN' '698 FAIZAN' '1092 ASAD'
## '2816 S.WAHAB ' '2833 SHAFI UDDIN' '724 ERUM' '2830 FURQAN '
## '2847 OWAISE' '2719 AMIR RAIZ' '953 USMAN' '734 SALMAN' '1053 WAQAS'
## '733 WAQAS SALAMAT' '969 USMAN' '945 KANWAL ' '928 M. Laique'
## '2709 Jwahir' '2826 KAZIM' '2767 FARHEEN' '2771 KINZA' '2808 FARHAN'
## '1112 ARIF ' '2827 AJMAL' '1082 ADEEL ' '2888 MOHSIN' '2886 SUFIYAN'
## '1084 IMRAN' '2836 M.FAIZAN' '729 AWAIS' '741 MUDDASIR' '755 BILAL'
## '1088 NOMAN' '2794 IMTIAZ' '939 A BASIT ' '940 WAQAS' '938 WAQAS'
## '2737 SABIR ' '2742 RIZWAN' '0223 SYED AHMED ALI' '2812 ANIQA'
## '2774 SONIA' '2733 HAMZA' '762 IMRAN' '761 SAQIB' '2815 ZOHAIB'
## '0415 SELEEM AHMED' '0916 WAQAS' 1426 '2879 HARIS ZAREEN' '2932 IJAZ'
## '2635 SHAFIQ' '2937 ASAD' '2915 NIGHET' '2943 ASIF' '2896 NOMAN '
## '719 KAMRAN' '2852 HAMZA' '715 Ali Zia' '2854 ZEESHAN' '2428 HAMZA'
## '954 SAEED' '2929 SHAH ZAIB' '2935 TAHIR' '1110 HASSAM'
## '1043 SALMAN AFZAL KH' '930 AZEEM ' '2953 Faizan uddin' '2861 ALINA'
## '2951 Ahtesham Khan' '2900 MUSARRAT' '2819 MEHWISH' '2916 ASGHER'
## '977 ARSALAN' '978 AFIFA ' '963 MEHTAB GUL' '2899 ADEEL'
## '2898 SADIA KHAN' '2979 Farhan' '2954 Anzar' 'hamza' '777 M.Akhtar'
## '915 KAMRAN ' '2820 NABEEL YOUSAF' '2876 JUNAID BUTT' '1002 A QADEER'
## '2960 ZIA' '1001 MOEED' '2962 FAIZAN' '1014 ALI' '3005 SHABIR ASLAM'
## '2914 SHABAN' '876 MUSTANSAR AYYAZ' '2973 TOUQEER' '2976 ZAIN'
## '3017 S TAIMOOR SHAH' '867 SANOBIR MASIH' '887 ASAD' '903 JALEEL'
## '2931 SADIA NAZIR' '3008 M IMRAN' '3009 RANA M AZAZ'
## '3016 TEHZEEM UL HAQ' '833 SAQIB ISHFAQ' '2999 AHSAN ABBASI'
## '3057 Faizan Anwer' '716 M.MAIRAJ' '3071 MIRZA TAIMOOR'
## '3024 ZESHAN JAVEED' '3030 NASIR ABBAS' '828 M.Waqar'
## '3135 RAHEEL ISMAIL' 'Saqib Al' '3102 M. Shoaib' '3053 Hira Naz'
## '3070 BABAR ' '713 MEHREEN' '3105 Ureedullah' '3054 Duke Alexander'
## '1007 SALMAN AFZAL ' '931 M.Azeem' '3160 Mohsin Raza' '3049 MUSHFIQ'
## '796 WAQAS' '795 RASHAD' '3058 Saima Malik' '3064 A GHANI' '797 NIDA'
## 'Faiz ul haq' '3148 Humza Asif' '3149 R.Zulfiqar ' '3122 Babar'
## '3018 RAJA DANISH' '811 QAMAR' '3073 MOIN ' '767 ZAHEER' '769 SAAD IQBAL'
## '768 M.Haroon ' '3045 RIZWAN' '3097 Umar Masood' '2877 MATEEN '
## '1003 Noman Ashiq' '1004 Farhan Liaquat ' '942 Salman'
## '877 Mubarak Shah' '878 S.Farhan' '2994 AMEER HAMZA'
## '3023 MUTQEMUN.NISA' '3021 IZHAR SHARIF' '3028 SUMMON GULL'
## '3027 FIRYAL JORGH' '3094 MUBASHAR ' '889 ZULFIQAR ALI' '1052 AMIR'
## '2997 MALIHA KHAN' '3026 SAMIA SHAKEEL' '13239 Subhan' '13275 AA'
## '13227 M.Rashid' '857 Naqash Masih' '856 M.Waqas' '855 Zulqarnain'
## '788 Noman M.Ashraf' '1111 M.ASIM ' '758 M JAWED' '756 ASAD' '699 M MUAZ'
## '13240 Usama Ahmed' '3204 A.Wahab' '813 M.Saqib'
## '13293 Muhammad Rizwan Khan' '901 Rafaqat' '920 zaigham Altaf '
## '921 USMAN BUTT' '919 M.Usman' '971 Bilal Ahmed' '970 Saira Mustaqeem'
## '3216 UMER HAYAT' '1113 Hamza Khan' '815 HAMZA ' '799 SYEDA KANZA'
## '907 Rustam Khan' '13220 S.Ahmed Hussain' '989 S.Asim Bukhari'
## '987 Aroosa' '3141 ISMAIL KHAN' '3182 Arsalan Aslam' '3140 M.Rizwan'
## '3206 Iqbal' '770 Huma Naz' '778 Zain ul Abdeen' '13238 M.Mohsin'
## '808 IMRAN ' 'M.Umair Aleem' '1105 A.Shahid' '1061 S.Talha'
## '739 M.Furqan' 'New ' '774 Shaikh' '916 Hassan Nadeem'
## '917 FAIZ UL HASAN' '3196 S.Adnan Al' '943 Mehmoona Shoaib'
## '941 HAFIZA SHUMAILA' '1008 JENNY' '13219 M.Sagheer' '922 BEHROZ ASLAM'
## '868 KAFEEL ' '13228 Uzma Noreen' '3190 Khurram' '3249 JALAL'
## '3252 SAQIB NAZIR' '3253 SUZANA' '13217 Abdullah Afzal'
## '3197 Nabeel Saleem' '2500 J ADEEL' '956 Sumaira Hassan'
## '955 Ahsan Ayyaz' 'M.Shahzad' '3285 shamoon masih' '1054 SHEIKH NADEEM'
## '823 Zeeshan Ahmed' '816 ZOHAIB ' '757 TAJDAR KHAN' '718 ABDUL WAKEEL'
## '717 Rashid Manzoor' '710 SYED NOMAN ALI' '995 Qasim Rehman'
## '3359Fahad Munir' '933 Tallat Mahmood' '0006 NAIM AHMED'
## '794 M.SALEEM BASATHIA' '3280 Arsalan Aslam' '980 Muhammad Nafeel'
## '802 NAVEED AHMED' '803 WAHEED KHAN' '806 RABIYA' '817 MUHAMMAD FAISAL'
## '818 Muhammad Obaid' '819 WAQAS HANIF' '3366 NAEEM ABAS ZAIDI'
## '3335 OMAIR' '812 HASSAN' '771 SHEERAZ UR REHMAN' '1060 IMRAN MUSTAQEEM'
## '780 IMRAN JAMI ' '779 OWAIS' '3297 ZAHID HUSSAIN' '735 MUHAMMAD OSAMA'
## '1005 NOMAN KHAN ' '946 SHOAIB' '3312 RANA AHMED AKASH' '731 ASIF'
## '3315 SOBIA PERVAIZ' '871 FOUZIA AZIZ' '1062 Noman Ishaque'
## '872 Ghayoor Hussain' '869 SONIA MAROOF' '881 RABIA'
## '894 Saqib Abbas Rathore' '898 Shahzeb Shaukat' '895 Naeem Ahmed'
## '897 Syed Qasim Raza Shah' '3278 Khalid Hussain' '899 Amjad Butt'
## '896 Walait Hussain' '1153 FAISAL ALI' '1156 M.ZEESHAN'
## '01231 Muhammad Junaid' '1017 Rashid Arshad' '904 Majid Hussain'
## '3048 SABINA ' '807 ali raza' '849 Ammad Maroof' '756 Muhammad Asad'
## '1136 IRZAM' '1222 M HAROON' '3400 TAHA AHMED' '1016 Hafiz abdul ghafoor'
## '1024 HASHIM HUSSAIN ' '1022 ZAINAB ' '1021 KIRAN ' '1125 SAEED '
## '997 MUHAMMAD ZOHAIB' '1134 uMAIR' '1165 IMRAN ALI' '1163 RAJA MUZAMMIL'
## '1164 ALI AZAAN' '1194 M FAHAD KHAN' '3362Rabia' '1185 aBDUL'
## '832 IRFAN AZIZ' '1191 Shahid' '1258 KASHIF MEHMOOD' '1182 Zeeshan'
## '1234 Samreen Bux' '1302 Faisal Azeem' '1276 Saqib Ul Hassan'
## '1456 Rizwan Rehman' '1250 Arsalan Shah' '1118 Makhdoom' '1089 adnan'
## '1306 Fahad Azeem' '1249 Ali Naseer' '1471 Ashir Rasheed'
## '1301 Ahmed Alam' '947 QASIM' '1296 SUHAIM IRFAN ' '1143 FIDA HUSSAIN'
## '1194 Kainat' '1193 Sehrish' '1407 Ghulam Ali' '1445 Nazim Khan'
## '1275 Muhammad Hammad Siddiqui' '800 M Waqas' '1124 ZUBAIR'
## '1457 Adnan Saleem' '704 GULSHAN KHAN' '1256 ALI HASSAN'
## '1270 AQEEL RASHEED' '1187 Zeeshan' '1154 M.Ali' '983 Salman Qureshi'
## '990 Asma Azam' '1152 M QASIM' '981 MUHAMMAD' '979 RAHEEL'
## '1111 MUSTAFA JAVED' '1289 Syed Hasnain Alam' '1273 Hamza Asad'
## '1292 Farzana Shoukat' '1020 ROCK' '1283 ASAD YOUNUS' '1298 Saima Naz'
## '01230 Tabbassum Nazeer Bhatti' '1291 Muhammad Aamir '
## '1434 Fahad Faheem' '1279 MUHAMMAD SHAFFAY' '1406 Zubair Ahmed'
## '911 FIDA HUSSAIN' '1244 Tayiba Khalid' '1259 NABEEL YOUSUF'
## '917 M.Faiz ul Hassan' '1405 M' 'SC02 STAFF' 'SC04 STAFF' 'SC01 STAFF'
## 'SC05 STAFF' 'SC06 STAFF' '1162 Muhammad N.' '1260 SOBIA PERWAIZ'
## '1262AHMED RAZA' '1087 A.WAHAB' '0009 FARIS' '883 FARHAN'
## '1126 IMRAN YASEEN' '1265 HAYAT KHAN' '1269 HARIS NAVEED'
## '1278 SHUMAILA RAEES' '1288 SHEZA ASHRAF' '1632 ABDUL REHMAN'
## '1937 Faisal Aman' '1180 Daud' '01476 Bilal Zulfiqar'
## '1487 Jafar Raza Jafri' 'Zulfiqar ali' '1480 Nafasat Ali' '1976 kashif'
## 'Salman Arif' '1243 Muhammad Osama Khan' '1609 Syed Danish' 'Hira Rasool'
## '1274 Muhammad Huzaifa' '01153 Uzair Rafiq' '1239 Usman Sarwar'
## '01503 Fahad Khan' '1618 Khurram' '01623 Kamran Shahzad'
## '01624 Kamran Nadeem' '1479 Muhammad Fayyaz Ghazi' '01938 Azeem Abbas'
## '1969 Bilal Umer' '1173 Ali' '1906 Masroor Ahmed' '3317 M.AMJAD'
## 'Miss Amara' 'Imtiaz Mumtaz' '1789 Faizan farasat ' '1980 ALI '
## 'Fahad Ali ' 'Ghazanfar Ali' '2200 ABDUL REHMAN' '1410 Sobia Khan'
## '01930 Zohaib Ali' '2799 MUZAMMIL' '1436 Muhammad Wajahat '
## '1416 HAMMAD ' '1271 HUMAIRA RAMZAN' '1464 Hammad Qaisar'
## 'Muhammad Shahzad' 'Umar farooq' '1483 Ahsan Aslam' '1411 Haseeb Khan '
## 'Syed Ateeb Hussain' 'Haris Khan' 'Hamza Khan' '1431 Abdul Wahab'
## '1433 Mohammad Tayyab ' '1418 Saba Khan' '1437 Noreen Aksha'
## '01512 Asadullah' '1522 Danish Khan' 'AHmed Mujahid' '01606 Waqas Anwer'
## '01474 Syed Muzaffar Hussain' 'Ishfaq Khalil' 'Mohammad Awais'
## '1873 Zubair Amjad' '01862 Usman Khan' 'M Qaiser Irshad'
## '0469 HAFIZ ADIL ' '1486 Syed Faraz Hussain' '1106 IMTIAZ MUMTAZ'
## 'Yasir Gulistan' '1885 Nargis Hameed' '1281 Urooba Mehmood'
## '01506 Dur E Shahwar Zehra' 'Roshan Aftab' '1408 Muzaffar Ali Shah'
## '1631 USMAN ALI KHAN' 'Syed Afroz Shah' 'Aqsa Fatima'
## '01934 Qasim Saleem' 'Adnan Shah' '01444 Zeeshan Anwer' 'Haider Ali'
## 'Hassan Abbas' '1492 ABDUL WAHAB' '1818 M.IMRAN' '1010 Anjum Farooq'
## '1495 HUMAIRA ILYAS' '01993 Mohammad Salman' '1794 NADEEM RAIQUE '
## '1998 Noshaba Amir' '1490 Sohail Pervez' 'SC03 STAFF'
## '1488 Rabia Shaheen ' '01475 Palwasha Farooq' '1442 Azhar Jahangir '
## '1489 Saba Arif' '1710 ILYAS ' '1626 HASSNAIN '
## '2020 NAHEEM AHMED QURESHI' '1484 Faiz Muhammad' '01625 Muhammad Asad'
## 'MUHAMMAD AMIR' '1485 Amjad Hussain' '1413 Syed Azam Abbas' '1135 aSIM'
## '1518 Zeeshan Ullah' '890 Usman Ali' 'MUHAMMAD ASIF BUTT'
## 'AHSAN HAMMAD ' '1494 MISS ANSA'
## '01956 Muhammad Hasnain' '1262 AHMED RAZA' 2201 '01948 Toqeer Ahmed'
## 'Abdul wahab' '2202 MALIK ISRAR' 'Afaq Asad' 'Abdul Rehman'
## 'Umair Shabeer' 'Azeem ' '1234 Samreen Buksh' 'Samreen Buksh'
## '2227 Danish' 'Robinson' 'Haseeb Tahir' '1876 Shoukat Ahmed Khan'
## 'Bilal..' 'Fazal.' '1083 MUZAMMIL' '1108 S.GHUFRAN AHMED'
## '01835 Rameez Aslam' '1107 S.Aijaz ' '01872 Fahad Baig' 'Waseem Ul haq'
## '1150 FIDA' 'Amna' 'MUdassir Hussain' 'Umer hayat' '2091 Aurangzaib'
## '1059 Sana Saeed' 'Kaneez Yasmeen' '3025 M TAYYAB' '1104 BILAL AHMED'
## '1102 FAIZAN SIDDIQUI' '1115 SALEEM' ' Wajid al Hussaini' 'SHAKIR'
## 'Waqar Hassain' '2318 HAMZA' '01931 Mohsin Ali' '1107 TARIQ HUSSAIN'
## 'Zeeshan' 'Daniyal.' 'Bilal' '1788 MANAN' '2196 WAQAR ' '2197 TAJAMMAL '
## '2278 DILAWAR' 'Mohsin Butt' '2294 IMRAN ' '01960 Sheikh Mohsin'
## '1935 Amir Masih' 2377 '2085 Sana Ullah Butt' '01870 Muhammad Shahbaz'
## '2190 SAJAWAL' 'Daniyal' 'Nebeel' 'Sumaira bibi' 'Asad Anwar'
## '1455 Miuhammad Imran' '1100 WAQAR KHAN ' 'AHSAN Ali' '88 Muhammad Waqas'
## '1932 Tassawar Abbas' '1516 AHSAN HAMMAD '
## 'MUHAMMAD MOAZZAM' '2279 M BILLAL ' 2320 '1501 Syed Abdul Basit'
## 'Waqas Zarmoin' 'Adnan Haider' 'Usama Haider' 'Usama Iqbal' 'Sohail'
## 'Fazal qadeer' 'Fahad Shoukat' '2159 Shahbaz ' 'UreedUllah' 'Awais Raza']
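The Customer values above mix numeric ID prefixes ('1106 IMTIAZ MUMTAZ'), inconsistent casing, trailing spaces and dots ('1789 Faizan farasat ', 'Bilal..'), bare integer codes (2201), and near-duplicates of the same person. A minimal normalization sketch, using a tiny synthetic sample of the values printed above (the cleaning rules themselves are assumptions, not part of the dataset's documentation):

```python
import re
import pandas as pd

# Synthetic sample mimicking the messy Customer values above: numeric ID
# prefixes, trailing spaces/dots, inconsistent casing, bare integer codes.
customers = pd.Series(['1106 IMTIAZ MUMTAZ', 'Imtiaz Mumtaz',
                       '1789 Faizan farasat ', 'Bilal..', 2201])

def clean_customer(value):
    """Strip a leading numeric ID and trailing dots/spaces, then title-case."""
    text = str(value).strip()
    text = re.sub(r'^\d+\s*', '', text)   # drop a leading code like '1106 '
    text = re.sub(r'[.\s]+$', '', text)   # drop trailing dots/spaces
    return text.title()

cleaned = customers.map(clean_customer)
# '1106 IMTIAZ MUMTAZ' and 'Imtiaz Mumtaz' now collapse to the same name;
# bare integer codes like 2201 become empty strings, flaggable as missing.
```

With rules like these, apparent duplicates in the listing above collapse, which matters before any per-customer aggregation.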
## =====================================
## Number of unique data for Qty is 60
## unique data for Qty is [ 1 -1 2 3 4 -2 6 0 5 27 8 17 12 14 24 7 19 -3
## -5 -4 -19 -14 -26 -12 -7 11 23 22 21 -21 -27 -28 -30 -15 -17 -9
## -6 18 9 10 20 26 16 15 13 -8 -23 -24 -18 -10 -16 -13 -11 -20
## 29 28 45 44 43 32]
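Qty takes both signs; negative quantities presumably mark sales returns, consistent with the SalesReturnReason column below. A sketch of splitting rows on the sign of Qty, shown on a tiny synthetic stand-in for `df` (the negative-means-return reading is an assumption):

```python
import pandas as pd

# Tiny synthetic stand-in for df (assumption: negative Qty marks a return).
sample = pd.DataFrame({'Qty': [1, -1, 2, -3, 0]})
sample['IsReturn'] = sample['Qty'] < 0

sales = sample[sample['Qty'] > 0]
returns = sample[sample['Qty'] < 0]
```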
## =====================================
## Number of unique data for SalesReturnReason is 7
## unique data for SalesReturnReason is [nan 'DISLIKE ' 'COLOR' 'SIZE' 'FIT' 'PATTERN ISSUE' 'DAMAGED / DEFECTED ']
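Several of these categories carry trailing spaces ('DISLIKE ', 'PATTERN ISSUE ', 'DAMAGED / DEFECTED '), which would create duplicate categories if other spellings appear. A one-line fix on a synthetic sample of the values above:

```python
import numpy as np
import pandas as pd

# Sample of the SalesReturnReason values printed above, trailing spaces and all.
reasons = pd.Series([np.nan, 'DISLIKE ', 'COLOR', 'DAMAGED / DEFECTED '])
reasons_clean = reasons.str.strip()   # NaN entries pass through unchanged
```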
## =====================================
## Number of unique data for Price is 287
## unique data for Price is [1710 1614 1805 ... 3483 3296 5278]
## =====================================
## Number of unique data for Amount is 836
## unique data for Amount is [ 1710 -1614 -1805 ... 3296 5278 -7538]
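Amount spans both signs and appears to be Qty times Price (e.g. -1614 matching a return of one 1614-rupee item). A hedged consistency check on synthetic rows, under that assumed relation:

```python
import pandas as pd

# Synthetic rows (assumption: Amount equals Qty * Price, with negative
# amounts on return rows).
sample = pd.DataFrame({'Qty':    [1,     -1,    2],
                       'Price':  [1710,  1614,  2376],
                       'Amount': [1710, -1614,  4752]})
mismatch = sample[sample['Qty'] * sample['Price'] != sample['Amount']]
```

On the real `df`, any rows surfacing in `mismatch` would point at data-entry errors or a different pricing rule.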
## =====================================
## Number of unique data for SaleExclGST is 2477
## unique data for SaleExclGST is [ 1368. -1614. -1805. ... 4767. -1512. -3208.]
## =====================================
## Number of unique data for GSTP is 5
## unique data for GSTP is [ 5 17 0 6 9]
## =====================================
## Number of unique data for GST is 758
## unique data for GST is [ 68 -81 -90 ... 924 -220 897]
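The first few printed values suggest GST is SaleExclGST times GSTP over 100, rounded to the nearest rupee (1368 at 5% gives 68.4, printed as 68). A sketch of that check on synthetic rows built from the values above; the GSTP = 5 pairing and the rounding rule are both assumptions:

```python
import pandas as pd

# First few values printed above, paired up (assumption: these rows carry
# GSTP = 5, and GST = SaleExclGST * GSTP / 100 rounded to the nearest rupee).
sample = pd.DataFrame({'SaleExclGST': [1368.0, -1614.0, -1805.0],
                       'GSTP': [5, 5, 5],
                       'GST': [68, -81, -90]})
expected = (sample['SaleExclGST'] * sample['GSTP'] / 100).round().astype(int)
mismatch = sample[expected != sample['GST']]
```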
## =====================================
## Number of unique data for DiscPer is 258
## unique data for DiscPer is [ 0.000e+00 2.500e+01 3.000e+01 ... 5.940e+01 4.750e+01 1.010e+01]
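A discount percentage should plausibly sit in [0, 100], yet DiscPer includes negative values (e.g. -47.4) and magnitudes above 100 (e.g. -110.7, -112.7), worth flagging before any discount analysis. A sketch on a synthetic sample:

```python
import pandas as pd

# Synthetic sample of DiscPer values, including the out-of-range ones
# seen above (assumption: valid discounts lie in [0, 100]).
disc = pd.Series([0.0, 25.0, -47.4, 100.0, -110.7])
invalid = disc[(disc < 0) | (disc > 100)]
```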
## =====================================
## Number of unique data for DiscAmount is 1000
## unique data for DiscAmount is [ 0.0000e+00 4.5100e+02 7.1300e+02 ... 1.6040e+03 1.1440e+03 8.9700e+02]
## =====================================
## Number of unique data for BarcodeDiscPer is 19
## unique data for BarcodeDiscPer is [ 20 0 15 30 10 -30 50 -20 -50 -10 60 -15 -60 -40 40 25 14 -14
## -25]
## =====================================
## Number of unique data for BarcodeDiscount is 952
## unique data for BarcodeDiscount is [ 342 0 356 ... 3664 -1320 -1512]
## =====================================
## Number of unique data for NetAmount is 2176
## unique data for NetAmount is [ 1436. -1695. -1895. ... 2515. -2660. 6175.]
## =====================================
## Number of unique data for PointsEarned is 139
## unique data for PointsEarned is [ 0 36 17 35 -17 -36 20 50 26 23 22 32 16 9
## 25 19 12 21 40 13 33 11 45 18 7 31 5 6
## -5 15 -25 8 14 -22 -20 -16 60 28 100 -23 -50 -14
## -13 10 38 -28 200 150 42 52 -35 -26 -18 -19 -21 -40
## 30 24 -12 -15 46 -33 75 37 72 29 80 -32 44 27
## -10 -37 48 57 70 105 87 349 122 192 262 297 471 41
## -11 54 -70 74 -30 -27 -9 -31 -29 58 -34 175 244 786
## 768 -7 -8 34 -45 51 85 55 -24 -42 65 63 56 187
## 299 821 411 149 112 53 -74 43 -43 -53 69 -65 751 559
## 49 111 -100 -66 66 140 90 215 108 39 120 84 106]
## =====================================
## Number of unique data for TaxPer is 2
## unique data for TaxPer is [6 9]
## =====================================
## Number of unique data for Cobrand Acc is 80
## unique data for Cobrand Acc is ['F/S LICENSE ' 'F/S LUXER PLAIN '
## 'F/S CAMBRIDGE EXECUTIVE ' 'F/S PRINCIPLE PLAIN '
## 'F/S PRINCIPLE CLASSIC YARN DYED' 'F/S LUXER YARN DYED '
## 'F/S PRINCIPLE SWAN ' 'F/S CAMBRIDGE CASUAL' 'PORT FOLIO (SHADES) F/S '
## 'PORT FOLIO F/S WCB ' 'F/S PORT FOLIO SHIRT YARN DYED'
## 'F/S NO IRON EVER' 'F/S DENIM ' 'F/S LUXER PLAIN WCB '
## 'F/S CAMBRIDGE SINCE 1958' 'F/S ARISTO ' 'H/S EXECUTIVE'
## 'H/S PRINCIPLE CLASSIC ' 'F/S ARISTO MASON' 'OVERDYE F/S ' 'H/S LUXER'
## 'F/S TOMORROW ' 'LICENSE H/S '
## 'F/S PRINCIPLE POPLIN LUXE MILANO' 'PRINTED F/S SHIRTS'
## 'F/S PRINCIPLE Y/DYED LUXE MILANO ' 'H/S PORT FOLIO Y/D '
## 'D STUDIO DESIGNERS SHIRT' 'OXFORD H/S' ' H/S TOMORROW'
## 'F/S AFTER HOURS ' 'CHEMBREY F/S ' 'F/S PRINCIPLE OXFORD '
## 'CHEMBREY H/S' 'AGE OF WISDOM F/S ' 'H/S PORT FOLIO WCB '
## 'PRIVILEGE F/S ' 'SEERSUCKER F/S SHIRT' 'LIGHT WEIGHT H/S '
## 'LIGHT WEIGHT F/S ' 'PORTFOLIO SATEEN F/S ' 'F/S DEAD SHIRT ' 'ITALIAN'
## 'H/S AFTER HOURS ' 'SEERSUCKER H/S SHIRT'
## 'LICENSE ' 'OXFORD LICENSE F/S' 'HERRING BONE F/S'
## 'MELANGE YARN DYED F/S ' 'FLANNEL F/S ' 'F/S SHARP CAMBRDIGE '
## 'CAMBRIDGE SHIRT' 'DOBBY F/S' 'Cotton Linen F/S '
## 'F/S ESSENTIALS MENS FORMAL SHIRTS' 'Cotton Slub F/S'
## 'OXFORD LICENSE H/S' 'F/S ACTIVE' 'PERSONALLY CAMBRIDGE'
## 'F/S D STUDIO ' 'F/S ZERO TOLERANCE ' 'F/S DOBBY'
## 'F/S HERRING BONE ' 'F/S MELANGE YARN DYED ' 'F/S LICENSE '
## 'H/S LICENSE ' 'H/S OXFORD LICENSE ' 'F/S OVERDYE '
## 'F/S OXFORD LICENSE ' 'F/S COTTON LINEN ' 'F/S PRINTED SHIRTS'
## 'H/S SEERSUCKER SHIRT' 'F/S CHEMBREY' 'H/S CHEMBREY ' 'F/S COTTON SLUB '
## 'F/S SEERSUCKER SHIRT' 'F/S FLANNEL' 'F/S AGE OF WISDOM '
## 'CAMBRIDGE UNIFORM ' 'F/S LIGHT WEIGHT ']
## =====================================
Since some columns contain either a single repeated value or mostly NaNs, we first compute the percentage of missing values in each column before deciding what to drop.
percent_missing = df.isnull().sum() * 100 / len(df)
missing_value_df = pd.DataFrame({'percent_missing': percent_missing})
missing_value_df## percent_missing
## BillNo 0.000000
## BillDate 0.000000
## LoyaltyCard 94.967780
## Customer 1.272083
## Description 90.148370
## BillMonth 0.000000
## Warehouse 0.000000
## RegionName 0.000000
## Location 0.000000
## Category 0.000000
## DepartmentName 0.000000
## BrandName 0.231613
## CoBrand 0.000000
## Barcode 0.000000
## DesignNo 0.000000
## Rejection 0.000000
## SeasonName 0.000000
## Attribute1 0.003731
## Attribute2 78.640674
## Attribute3 53.847141
## Attribute4 0.000000
## Attribute5 49.659594
## Attribute6 100.000000
## Attribute7 100.000000
## Attribute8 26.839700
## LocalImport 0.000000
## Color 0.000000
## Sizes 0.000000
## DiscountType 24.951573
## SalesmanName 0.000000
## Qty 0.000000
## SalesReturnReason 97.793404
## Price 0.000000
## Amount 0.000000
## SaleExclGST 0.000000
## GSTP 0.000000
## GST 0.000000
## DiscPer 0.000000
## DiscAmount 0.000000
## BarcodeDiscPer 0.000000
## BarcodeDiscount 0.000000
## NetAmount 0.000000
## PointsEarned 0.000000
## TaxPer 0.000000
## Cobrand Acc 0.000000
msno.matrix(df)
Columns Attribute6 and Attribute7 have 100% missing values, so both are dropped. There is no description available for Attribute5, but it is retained and imputed later. Attribute4 contains only a single repeated value, so it carries no information.
df.drop(['Attribute6','Attribute7'],axis=1,inplace=True)
df.Attribute4.unique()## array(['CAMBRIDGE'], dtype=object)
The SalesmanName column is dropped since we are not interested in attributing sales to individual salespeople, which could bias decision making.
df.drop('Attribute4',axis=1,inplace=True)
df.drop('SalesmanName',axis=1,inplace=True)
df.drop('Category',axis=1,inplace=True)
The data in columns CoBrand and Cobrand Acc is 99% similar, so we keep only one of the two.
from fuzzywuzzy import fuzz
# token_sort_ratio expects strings, so the Series are rendered to text first
fuzz.token_sort_ratio(df['CoBrand'].to_string(), df['Cobrand Acc'].to_string())
df.drop('CoBrand',axis=1,inplace=True)
df.drop('Barcode',axis=1,inplace=True)
Based on the data description, we rename some columns to make them more readable.
df = df.rename(columns={'Attribute1':'Inventory_status',
                        'Attribute2':'Offers','Attribute8':'Class_of_cloth',
                        'Attribute3':'Import_type'})
Next we split customers into returning and non-returning. Customers are identified by name, and we assume that rows with an identical name belong to a returning customer.
The later plots will show whether returning customers are more profitable than non-returning ones.
df['Customer']=df.Customer.duplicated()
df['Customer'].replace(True,'Returning_Customer',inplace=True)
df['Customer'].replace(False,'Non_Returning_Customer',inplace=True)
df.Customer.value_counts()## Returning_Customer 596405
## Non_Returning_Customer 73677
## Name: Customer, dtype: int64
For LoyaltyCard, duplicated card values are marked 'Yes' and the rest 'No'.
df['LoyaltyCard']=df.LoyaltyCard.duplicated()
df['LoyaltyCard'].replace(True,'Yes',inplace=True)
df['LoyaltyCard'].replace(False,'No',inplace=True)
We use sklearn's SimpleImputer with the most-frequent strategy to fill the remaining missing values.
imp = SimpleImputer(strategy="most_frequent")
df['Inventory_status']=imp.fit_transform(df[['Inventory_status']])
df['Import_type']=imp.fit_transform(df[['Import_type']])
df['Offers']=imp.fit_transform(df[['Offers']])
df['Attribute5']=imp.fit_transform(df[['Attribute5']])
df['BrandName']=imp.fit_transform(df[['BrandName']])
df['Description']=imp.fit_transform(df[['Description']])
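As an aside, a single fitted SimpleImputer computes the most frequent value per column, so several columns can be filled in one call rather than one call per column. A minimal sketch with a toy frame (column names illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# SimpleImputer computes the most frequent value per column, so several
# columns can be filled in a single call; the toy frame is illustrative
d = pd.DataFrame({"brand": ["A", np.nan, "A", "B"],
                  "status": ["in", "in", np.nan, "out"]})
imp = SimpleImputer(strategy="most_frequent")
filled = pd.DataFrame(imp.fit_transform(d), columns=d.columns)
```

Each NaN is replaced by the mode of its own column, just as in the per-column calls above.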
df.SalesReturnReason = df.SalesReturnReason.replace(np.nan,'No information available')
df.Class_of_cloth = df.Class_of_cloth.replace(np.nan, "No information") # replacing NaN values with a placeholder string
df.DiscountType = df.DiscountType.replace(np.nan, 'No Discount') # replacing NaN values with 'No Discount'
df.describe().T## count mean std ... 50% 75% max
## Qty 670082.0 0.899518 0.611418 ... 1.0 1.0 45.0
## Price 670082.0 2437.725566 714.776793 ... 2259.0 2376.0 6127.0
## Amount 670082.0 2173.785338 1621.785795 ... 2259.0 2376.0 106920.0
## SaleExclGST 670082.0 1740.625980 1489.339984 ... 1663.0 2186.0 77455.0
## GSTP 670082.0 6.092880 2.285465 ... 6.0 6.0 17.0
## GST 670082.0 109.105263 120.435080 ... 95.0 125.0 4647.0
## DiscPer 670082.0 0.382138 3.200854 ... 0.0 0.0 100.0
## DiscAmount 670082.0 8.756731 95.321598 ... 0.0 0.0 14830.0
## BarcodeDiscPer 670082.0 17.706990 20.796915 ... 20.0 30.0 60.0
## BarcodeDiscount 670082.0 423.838609 536.809144 ... 484.0 713.0 32085.0
## NetAmount 670082.0 1849.731244 1581.756942 ... 1747.0 2295.0 82102.0
## PointsEarned 670082.0 1.078650 6.030229 ... 0.0 0.0 821.0
## TaxPer 670082.0 6.359038 0.973759 ... 6.0 6.0 9.0
##
## [13 rows x 8 columns]
Now we will treat the outliers and apply some statistical tests.
The boxplots reveal many outliers in the numerical columns; we use the IQR method to impute them.
# Imputing outliers with mean
def impute_outliers_IQR_with_mean(df):
    Q1 = df.quantile(0.25)
    Q3 = df.quantile(0.75)
    IQR = Q3 - Q1
    # Lower bound
    lower = Q1 - 1.5*IQR
    # Upper bound
    upper = Q3 + 1.5*IQR
    df = np.where(df > upper,
                  df.mean(),
                  np.where(df < lower,
                           df.mean(),
                           df))
    return df
# Library for skewness
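The median-based helper `impute_outliers_IQR_with_median` called further below is never defined in the text; a minimal sketch, mirroring the mean version above:

```python
import numpy as np
import pandas as pd

# Median counterpart of impute_outliers_IQR_with_mean: values outside the
# 1.5*IQR fences are replaced by the column median instead of the mean
def impute_outliers_IQR_with_median(s):
    Q1 = s.quantile(0.25)
    Q3 = s.quantile(0.75)
    IQR = Q3 - Q1
    lower = Q1 - 1.5 * IQR
    upper = Q3 + 1.5 * IQR
    return np.where((s < lower) | (s > upper), s.median(), s)

# the single outlier (100) is capped to the median (3)
capped = impute_outliers_IQR_with_median(pd.Series([1, 2, 3, 4, 100]))
```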
from scipy.stats import skew
print('Skew of Price',skew(df.Price))## Skew of Price 1.7332960811793
print('Skew of Amount',skew(df.Amount))## Skew of Amount 0.5264229479147174
print('Skew of SaleExclGST',skew(df.SaleExclGST))## Skew of SaleExclGST 0.053907402377267245
print('Skew of GSTP',skew(df.GSTP))## Skew of GSTP 3.422663381156023
print('Skew of GST',skew(df.GST))## Skew of GST 1.8886742450354679
print('Skew of BarcodeDiscPer',skew(df.BarcodeDiscPer))## Skew of BarcodeDiscPer -0.11344278021015818
print('Skew of BarcodeDiscount',skew(df.BarcodeDiscount))## Skew of BarcodeDiscount 1.639017468026673
print('Skew of NetAmount',skew(df.NetAmount))## Skew of NetAmount 0.11210500286062883
print('Skew of PointsEarned',skew(df.PointsEarned))## Skew of PointsEarned 21.43903163597488
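The per-column skew printouts above can also be produced in one pass over all numeric columns; a small sketch on toy data:

```python
import pandas as pd
from scipy.stats import skew

# Compute skewness for every numeric column at once; the frame is illustrative
toy = pd.DataFrame({"symmetric": [1.0, 2.0, 3.0, 4.0],
                    "right_skewed": [1.0, 1.0, 2.0, 50.0]})
skews = toy.select_dtypes(include="number").apply(skew)
```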
cols = df.select_dtypes(include='number')
cat_cols = cols
i = 0
while i < 10:
    fig = plt.figure(figsize=[25,4])
    plt.subplot(1,3,1)
    sns.distplot(a=cat_cols.iloc[:,i], hist=True)
    i += 1
    plt.subplot(1,3,2)
    sns.distplot(a=cat_cols.iloc[:,i], hist=True)
    i += 1
    plt.show()
Now we impute the outliers: right-skewed columns with the median, and 'Price', 'BarcodeDiscPer' and 'NetAmount' with the mean.
df['Qty'] = impute_outliers_IQR_with_median(df['Qty'])
df['Price'] = impute_outliers_IQR_with_mean(df['Price'])
df['Amount'] = impute_outliers_IQR_with_median(df['Amount'])
df['SaleExclGST'] = impute_outliers_IQR_with_median(df['SaleExclGST'])
df['GSTP'] = impute_outliers_IQR_with_median(df['GSTP'])
df['GST'] = impute_outliers_IQR_with_median(df['GST'])
df['BarcodeDiscPer'] = impute_outliers_IQR_with_mean(df['BarcodeDiscPer'])
df['BarcodeDiscount'] = impute_outliers_IQR_with_median(df['BarcodeDiscount'])
df['NetAmount'] = impute_outliers_IQR_with_mean(df['NetAmount'])
df['PointsEarned'] = impute_outliers_IQR_with_median(df['PointsEarned'])
df['DiscPer'] = impute_outliers_IQR_with_median(df['DiscPer'])
df['DiscAmount'] = impute_outliers_IQR_with_median(df['DiscAmount'])
We can apply the Shapiro-Wilk test to check whether the data are normally distributed.
# Shapiro wilk test
from scipy.stats import shapiro
print('ShapiroTest of Price',shapiro(df.Price))## ShapiroTest of Price ShapiroResult(statistic=0.8969478011131287, pvalue=0.0)
##
## /opt/anaconda3/lib/python3.9/site-packages/scipy/stats/morestats.py:1760: UserWarning: p-value may not be accurate for N > 5000.
## warnings.warn("p-value may not be accurate for N > 5000.")
print('ShapiroTest of Amount',shapiro(df.Amount))## ShapiroTest of Amount ShapiroResult(statistic=0.908415675163269, pvalue=0.0)
print('ShapiroTest of SaleExclGST',shapiro(df.SaleExclGST))## ShapiroTest of SaleExclGST ShapiroResult(statistic=0.8960244059562683, pvalue=0.0)
print('ShapiroTest of GSTP',shapiro(df.GSTP))## ShapiroTest of GSTP ShapiroResult(statistic=0.6320035457611084, pvalue=0.0)
print('ShapiroTest of GST',shapiro(df.GST))## ShapiroTest of GST ShapiroResult(statistic=0.8931977152824402, pvalue=0.0)
print('ShapiroTest of BarcodeDiscPer',shapiro(df.BarcodeDiscPer))## ShapiroTest of BarcodeDiscPer ShapiroResult(statistic=0.8688030242919922, pvalue=0.0)
print('ShapiroTest of BarcodeDiscount',shapiro(df.BarcodeDiscount))## ShapiroTest of BarcodeDiscount ShapiroResult(statistic=0.8954393863677979, pvalue=0.0)
print('ShapiroTest of NetAmount',shapiro(df.NetAmount))## ShapiroTest of NetAmount ShapiroResult(statistic=0.9133129715919495, pvalue=0.0)
print('ShapiroTest of PointsEarned',shapiro(df.PointsEarned))## ShapiroTest of PointsEarned ShapiroResult(statistic=1.0, pvalue=1.0)
##
## /opt/anaconda3/lib/python3.9/site-packages/scipy/stats/morestats.py:1757: UserWarning: Input data for shapiro has range zero. The results may not be accurate.
## warnings.warn("Input data for shapiro has range zero. The results "
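The warnings note that Shapiro-Wilk p-values are unreliable for N > 5000. SciPy's D'Agostino-Pearson `normaltest` is a common large-sample alternative; a sketch on synthetic data:

```python
import numpy as np
from scipy.stats import normaltest

# D'Agostino-Pearson normality test; unlike Shapiro-Wilk, its p-value
# remains usable for very large samples
rng = np.random.default_rng(0)
stat_n, p_n = normaltest(rng.normal(size=100_000))       # normal data
stat_s, p_s = normaltest(rng.exponential(size=100_000))  # clearly non-normal
```

For the exponential sample the test rejects normality decisively.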
Of the common normalization techniques, we use the standard scaler (z-score standardization) to normalize the data.
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
df[['Qty','Price','Amount','SaleExclGST','GSTP','GST','BarcodeDiscPer','BarcodeDiscount','NetAmount','PointsEarned','DiscPer','DiscAmount']] = scaler.fit_transform(df[['Qty','Price','Amount','SaleExclGST','GSTP','GST','BarcodeDiscPer','BarcodeDiscount','NetAmount','PointsEarned','DiscPer','DiscAmount']])
Checking the normality again after normalization:
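As a sanity check, StandardScaler is exactly the z-score transform (x - mean) / std with the population standard deviation; a small sketch verifying the equivalence:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# StandardScaler performs z-score standardization column-wise,
# using the population (ddof=0) standard deviation
X = np.array([[1.0], [2.0], [3.0], [4.0]])
scaled = StandardScaler().fit_transform(X)
manual = (X - X.mean(axis=0)) / X.std(axis=0)
```

After scaling, each column has mean 0 and unit variance; note this rescales but does not make a skewed distribution normal.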
from scipy.stats import shapiro
print('ShapiroTest of Price',shapiro(df.Price))## ShapiroTest of Price ShapiroResult(statistic=0.8945704102516174, pvalue=0.0)
##
## /opt/anaconda3/lib/python3.9/site-packages/scipy/stats/morestats.py:1760: UserWarning: p-value may not be accurate for N > 5000.
## warnings.warn("p-value may not be accurate for N > 5000.")
print('ShapiroTest of Amount',shapiro(df.Amount))## ShapiroTest of Amount ShapiroResult(statistic=0.9033786058425903, pvalue=0.0)
print('ShapiroTest of SaleExclGST',shapiro(df.SaleExclGST))## ShapiroTest of SaleExclGST ShapiroResult(statistic=0.8960385918617249, pvalue=0.0)
print('ShapiroTest of GSTP',shapiro(df.GSTP))## ShapiroTest of GSTP ShapiroResult(statistic=0.6283426880836487, pvalue=0.0)
print('ShapiroTest of GST',shapiro(df.GST))## ShapiroTest of GST ShapiroResult(statistic=0.8964444398880005, pvalue=0.0)
print('ShapiroTest of BarcodeDiscPer',shapiro(df.BarcodeDiscPer))## ShapiroTest of BarcodeDiscPer ShapiroResult(statistic=0.8605928421020508, pvalue=0.0)
print('ShapiroTest of BarcodeDiscount',shapiro(df.BarcodeDiscount))## ShapiroTest of BarcodeDiscount ShapiroResult(statistic=0.8955459594726562, pvalue=0.0)
print('ShapiroTest of NetAmount',shapiro(df.NetAmount))## ShapiroTest of NetAmount ShapiroResult(statistic=0.9149541854858398, pvalue=0.0)
print('ShapiroTest of PointsEarned',shapiro(df.PointsEarned))## ShapiroTest of PointsEarned ShapiroResult(statistic=1.0, pvalue=1.0)
##
## /opt/anaconda3/lib/python3.9/site-packages/scipy/stats/morestats.py:1757: UserWarning: Input data for shapiro has range zero. The results may not be accurate.
## warnings.warn("Input data for shapiro has range zero. The results "
print('ShapiroTest of DiscPer',shapiro(df.DiscPer))## ShapiroTest of DiscPer ShapiroResult(statistic=1.0, pvalue=1.0)
print('ShapiroTest of DiscAmount',shapiro(df.DiscAmount))## ShapiroTest of DiscAmount ShapiroResult(statistic=1.0, pvalue=1.0)
## Ttest_indResult(statistic=0.6562095394492546, pvalue=0.5116895372324733)
# With p = 0.51, there is no significant price difference between the Karachi and Lahore subsets.
summary, results = rp.ttest(group1= df_karachi['Price'], group1_name= "Karachi",
                            group2= df_lahore['Price'], group2_name= "Lahore")
results
df1=pd.read_csv('/Users/snawaz/Documents/pychilla2/teamproject_sep3/Deep_note_linked/cleaned_data.csv')
Dependent variables in the various ML algorithms will be:
* Price for linear regression, where other selected features are the independent variables.
* Price for multilinear regression, where all other variables are independent.
* Inventory_status and LocalImport in the classification models, with all other columns as independent variables.
* Price in the deep learning model.
* SeasonName for the two clustering models.
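The one-hot encoding used in the next step can be cross-checked against pandas' `get_dummies`, which produces the same indicator columns in the same sorted order; a sketch with a hypothetical Offers column:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# get_dummies and OneHotEncoder yield identical indicator matrices,
# both with columns sorted by category; the Offers values are hypothetical
frame = pd.DataFrame({"Offers": ["Sale", "None", "Sale", "Bundle"]})
encoded = OneHotEncoder().fit_transform(frame).toarray()
dummies = pd.get_dummies(frame["Offers"])
```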
One-hot encoding for two variables: Attribute8 (renamed Class_of_cloth), which is the class of the sales, and LocalImport; the code below shows the pattern on the Offers column.
One_hot_encoded_data = df_Ml[['Offers']]
enc = OneHotEncoder()
enc_results = enc.fit_transform(One_hot_encoded_data)
enc=pd.DataFrame(enc_results.toarray(), columns=enc.categories_)
Inventory_status and LocalImport are encoded in the same manner.
List of libraries used throughout this machine learning section:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import missingno as msno
import plotly.express as px
import scipy.special
from bokeh.layouts import gridplot
from bokeh.plotting import figure, show
import scipy.stats as stats
import xgboost as xgb
import descartes
import plotly.express as px
import folium
import sklearn
## ML libraries
from xgboost import XGBRegressor
from sklearn.linear_model import LogisticRegression,LinearRegression,RidgeCV,Lasso,LassoCV
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder,OneHotEncoder
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import ExtraTreesClassifier,RandomForestRegressor,RandomForestClassifier,GradientBoostingClassifier
from sklearn.model_selection import train_test_split,learning_curve, cross_val_predict,cross_validate,cross_val_score,KFold
from xgboost import XGBClassifier
from sklearn.tree import DecisionTreeClassifier,DecisionTreeRegressor
from sklearn.svm import SVR,LinearSVR,SVC
from sklearn.feature_selection import RFECV , RFE
from time import time
from lightgbm import LGBMRegressor
from sklearn.metrics import pairwise_distances
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score,RepeatedStratifiedKFold,KFold
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression,OrthogonalMatchingPursuit
from sklearn.ensemble import StackingRegressor,ExtraTreesRegressor,RandomForestRegressor,GradientBoostingRegressor,AdaBoostRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split, GridSearchCV,RepeatedKFold,RepeatedStratifiedKFold
from sklearn.metrics import f1_score, classification_report, accuracy_score, mean_squared_error, precision_score
from sklearn.metrics.cluster import adjusted_rand_score,contingency_matrix
from numpy import unique
from numpy import where
from sklearn.cluster import DBSCAN,KMeans,MiniBatchKMeans
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score
from yellowbrick.cluster import SilhouetteVisualizer
# libraries of DL
from time import time
from sklearn.multioutput import MultiOutputRegressor
from sklearn.datasets import make_regression
from keras.models import Sequential
from keras.layers import Dense
from numpy import asarray
from pandas import set_option
from sklearn.pipeline import Pipeline
#scores
from sklearn.metrics import precision_recall_curve,confusion_matrix,mean_squared_error,mean_absolute_error,explained_variance_score,max_error,r2_score,median_absolute_error,mean_squared_log_error,silhouette_score
df_M = df_Ml.sample(10000).reset_index(drop=True)
X=df_M.drop('LocalImport',axis=1)
y=df_M['LocalImport']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=44, shuffle=True)
final_clf = None
clf_names = ["Logistic Regression", "KNN(3)", "XGBoost Classifier", "Random forest classifier", "Decision Tree Classifier",
             "Gradient Boosting Classifier", "Support Vector Machine"]
classifiers = []
scores = []
for i in range(10):
    tempscores = []
    # Logistic Regression
    lr_clf = LogisticRegression(n_jobs=-1)
    lr_clf.fit(X_train, y_train)
    tempscores.append((lr_clf.score(X_test, y_test))*100)
    # KNN with n_neighbors = 3 (to match the "KNN(3)" label)
    knn3_clf = KNeighborsClassifier(n_neighbors=3, n_jobs=-1)
    knn3_clf.fit(X_train, y_train)
    tempscores.append((knn3_clf.score(X_test, y_test))*100)
    # XGBoost
    xgbc = XGBClassifier(n_jobs=-1, seed=41)
    xgbc.fit(X_train, y_train)
    tempscores.append((xgbc.score(X_test, y_test))*100)
    # Random Forest
    rf_clf = RandomForestClassifier(n_jobs=-1)
    rf_clf.fit(X_train, y_train)
    tempscores.append((rf_clf.score(X_test, y_test))*100)
    # Decision Tree
    dt_clf = DecisionTreeClassifier()
    dt_clf.fit(X_train, y_train)
    tempscores.append((dt_clf.score(X_test, y_test))*100)
    # Gradient Boosting
    gb_clf = GradientBoostingClassifier()
    gb_clf.fit(X_train, y_train)
    tempscores.append((gb_clf.score(X_test, y_test))*100)
    # SVM
    svm_clf = SVC(gamma="scale")
    svm_clf.fit(X_train, y_train)
    tempscores.append((svm_clf.score(X_test, y_test))*100)
    scores.append(tempscores)
scores = np.array(scores)
clfs = pd.DataFrame({"Classifier":clf_names})
for i in range(len(scores)):
    clfs['iteration' + str(i)] = scores[i].T
means = clfs.mean(axis = 1)
means = means.values.tolist()
clfs["Average"] = means
clfs.set_index("Classifier", inplace = True)
print("Accuracies : ")
clfs["Average"].head(10)
[Figure: average accuracies of the candidate classifiers for the LocalImport variable]
X=df_M.drop('LocalImport',axis=1)
y=df_M['LocalImport']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=44, shuffle=True)
print('X_train shape is ' , X_train.shape)
print('X_test shape is ' , X_test.shape)
print('y_train shape is ' , y_train.shape)
print('y_test shape is ' , y_test.shape)
We use feature importances (and RFE) to select the best features for each model.
model = GradientBoostingClassifier()
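The repeated random train/test resplits used above for model comparison can alternatively be scored with k-fold cross-validation, which uses every row for both training and validation exactly once; a sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation: one accuracy per fold, each row validated once
X_demo, y_demo = make_classification(n_samples=300, random_state=0)
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X_demo, y_demo, cv=5)
```

The mean and standard deviation of `cv_scores` summarize a model the same way the averaged iterations do, with less variance from any single split.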
model.fit(X,y)
plt.style.use('ggplot')
plt.figure(figsize=(6,10))
feat_importances = pd.Series(model.feature_importances_, index=X.columns)
feat_importances.nlargest(50).plot(kind='barh')
plt.savefig("extra_tree.png",dpi=200)
plt.show()
[Figure: feature importances from the gradient boosting model]
Only 7 features out of 38 are good enough to train our final model, so we drop the rest below.
df_M = df_Ml[['BrandName','Attribute5','GSTP','SeasonName','BillMonth','Price','DesignNo','Import_type','LocalImport']]
We could also have selected features by their Pearson correlation coefficients from a correlation chart.
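The Pearson-correlation screening mentioned above can be sketched as follows; the toy frame and the 0.8 threshold are illustrative:

```python
import pandas as pd

# Correlation-based screening: keep features whose absolute Pearson
# correlation with the target exceeds a threshold (0.8 is illustrative)
toy = pd.DataFrame({"x1": [1, 2, 3, 4],
                    "x2": [4, 3, 2, 1],
                    "noise": [1, 1, 2, 1],
                    "target": [2, 4, 6, 8]})
corr = toy.corr()["target"].drop("target")
selected = corr[corr.abs() > 0.8].index.tolist()
```

Note this only captures linear relationships, unlike the tree-based importances used above.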
We make a dictionary of hyperparameters for the gradient boosting model and apply grid-search cross-validation to select the best hyperparameters for our chosen model.
#Creating Parameters
params = {
    'learning_rate':[0.1, 1],
    'n_estimators':[5, 9, 11, 12, 13, 15, 20, 25, 26, 29, 31, 50, 75, 100],
    'max_features':['auto', 'sqrt', 'log2'],
    'criterion':['friedman_mse', 'squared_error', 'mse'],
    'loss':['log_loss', 'deviance', 'exponential']
}
#Fitting the model
from sklearn.model_selection import GridSearchCV
rf = GradientBoostingClassifier()
grid = GridSearchCV(rf, params, cv=3, scoring='accuracy')
grid.fit(X_train, y_train)
print(grid.best_params_)
print("Accuracy:"+ str(grid.best_score_))
## {'criterion': 'friedman_mse', 'learning_rate': 1, 'loss': 'exponential', 'max_features': 'auto', 'n_estimators': 75}
## Accuracy:1.0
This gives us the best hyperparameters under the chosen scoring parameter, accuracy.
Remember, the model is refit with the same X_train, y_train data.
# applying model with best hyperparameters
rf = GradientBoostingClassifier(criterion='friedman_mse', learning_rate=1, loss='exponential', max_features='auto', n_estimators=75)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)
# from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
print("Accuracy Score: ", accuracy_score(y_test, y_pred))
print("Confusion Matrix: ", confusion_matrix(y_test, y_pred))
## Accuracy Score: 1.0
## Confusion Matrix: [[ 9895 0]
##  [ 0 191130]]
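An accuracy of exactly 1.0 usually warrants a leakage check. One quick test is to refit the same model on shuffled labels: if test accuracy then drops to roughly chance, the pipeline itself is not leaking. A sketch on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Refit on randomly permuted labels; with the feature-label link broken,
# test accuracy should hover near chance (0.5 for balanced binary labels)
X_demo, y_demo = make_classification(n_samples=600, random_state=0)
y_shuffled = np.random.default_rng(0).permutation(y_demo)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_shuffled, random_state=0)
chance_acc = GradientBoostingClassifier(random_state=0).fit(Xtr, ytr).score(Xte, yte)
```

An accuracy far above chance on shuffled labels would indicate information leaking from features into the target.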
from sklearn.naive_bayes import GaussianNB
from yellowbrick.classifier import ClassificationReport
# Instantiate the classification model and visualizer
visualizer = ClassificationReport(rf)
visualizer.fit(X_train, y_train) # Fit the visualizer and the model
visualizer.score(X_test, y_test) # Evaluate the model on the test data
visualizer.show()
[Figure: classification report heatmap]
import pickle
pkl_filename = "localImport_model.pkl"
with open(pkl_filename, 'wb') as file:
pickle.dump(grid, file)
# Load from file
with open(pkl_filename, 'rb') as file:
pickle_model = pickle.load(file)
# Calculate the accuracy score and predict target values
score = pickle_model.score(X_test, y_test)
y_predict = pickle_model.predict(X_test)
For the Inventory_status target, the grid search gives:
## Best Params: {'criterion': 'entropy', 'max_features': None, 'n_estimators': 75, 'random_state': 1} Train MSE: 0.0 Test MSE: 0.146
modelkn = RandomForestClassifier(random_state=1,criterion= 'entropy' ,n_jobs=-1, n_estimators=75, max_features=None)
modelkn= modelkn.fit(X_train,y_train)
y_11 = modelkn.predict(X_test)
[Figure: model scores for Inventory_status]
[Figure: confusion matrix for Inventory_status]
Target features: the Price of the article, and SaleExclGST gained by the company plus NetAmount paid by the customer.
One regression model has a single target feature, while the other has two target features.
The only difference from the previous approach is the use of RepeatedKFold cross-validation to obtain the best model.
Model evaluation is done with the R2 score and the negative RMSE score.
Again we use a sample of 10000 rows to compare models, followed by the full sample to train the final model.
models = {}
models['lr']= LinearRegression()
models['dr'] = DecisionTreeRegressor()
models['rf'] = RandomForestRegressor()
models['kn']= KNeighborsRegressor()
models['ad'] = AdaBoostRegressor()
models['ex'] = ExtraTreesRegressor()
models['sv'] = SVR()
from sklearn import model_selection
for model in models:
    cv = model_selection.RepeatedKFold(n_splits=100, n_repeats=1, random_state=1)
    n_scores = model_selection.cross_val_score(models[model], X, y, scoring='neg_root_mean_squared_error', cv=cv, n_jobs=-1)
    print(model, np.mean(n_scores), np.std(n_scores))
[Figure: cross-validated model scores]
[Figure: feature selection with RFE and SelectKBest]
[Figure: regression model comparison for predicting price; negative RMSE is the evaluation metric]
[Figure: optimum feature selection for price prediction]
We get an R2 score of 1 on the training set; the other scores are given below.
R-squared score (training): 1.000 R-squared score (test): 0.997
Test MSE: 0.002528635026436538 Test RMSE: 0.001264317513218269
For clustering we select the variable SeasonName and find the best number of clusters using the elbow method and the silhouette score.
# Use silhouette score
range_n_clusters = list(range(2, 10))
print("Number of clusters from 2 to 9: \n", range_n_clusters)
fig, ax = plt.subplots(3, 2, figsize=(15,8))
for n_clusters in range_n_clusters:
    clusterer = MiniBatchKMeans(n_clusters=n_clusters)
    preds = clusterer.fit_predict(df_Ml)
    centers = clusterer.cluster_centers_
    q, mod = divmod(n_clusters, 2)
    score = silhouette_score(df_Ml, preds)
    print("For n_clusters = {}, silhouette score is {}".format(n_clusters, score))
    visualizer = SilhouetteVisualizer(clusterer, colors='yellowbrick', ax=ax[q-1][mod])
    visualizer.fit(df_Ml)
inertias = []
for n_clusters in range(2, 15):
    km = KMeans(n_clusters=n_clusters).fit(df_Ml)
    inertias.append(km.inertia_)
plt.plot(range(2, 15), inertias, 'k')
plt.title("Inertia vs Number of Clusters")
plt.xlabel("Number of clusters")
plt.ylabel("Inertia")
# define the model
model = MiniBatchKMeans(n_clusters=6)
# fit the model
model.fit(df_Ml)
# assign a cluster to each example
yhat = model.predict(df_Ml)
# retrieve unique clusters
clusters = unique(yhat)
# positional indexing below needs a NumPy array, not a DataFrame
X_arr = df_Ml.values
# create scatter plot for samples from each cluster
for cluster in clusters:
    # get row indexes for samples with this cluster
    row_ix = where(yhat == cluster)
    # create scatter of these samples
    plt.scatter(X_arr[row_ix, 0], X_arr[row_ix, 1])
# show the plot
plt.show()
# define the model
dbscan_model = DBSCAN(eps=0.25, min_samples=9)
# train the model and assign each data point to a cluster
# (DBSCAN has no predict method; fit_predict returns the labels directly)
dbscan_result = dbscan_model.fit_predict(df_Ml)
# get all of the unique clusters
dbscan_clusters = unique(dbscan_result)
# positional indexing below needs a NumPy array, not a DataFrame
X_arr = df_Ml.values
# plot the DBSCAN clusters
for dbscan_cluster in dbscan_clusters:
    # get data points that fall in this cluster
    index = where(dbscan_result == dbscan_cluster)
    # make the plot
    plt.scatter(X_arr[index, 0], X_arr[index, 1])
# show the DBSCAN plot
plt.show()
Deep learning model for predicting price and comparison with regression models
We use a Sequential model with several fully-connected layers. ReLU is the more popular activation in deep networks, but tanh is used here on a trial basis.
Sigmoid is rarely used because it is slow to train; dropout layers are added to reduce overfitting.
The model is compiled with the Adam optimizer and an MSE loss.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Activation, Dropout
from tensorflow.keras.optimizers import Adam
X_train = np.array(X_train)
X_test = np.array(X_test)
y_train = np.array(y_train)
y_test = np.array(y_test)
model = Sequential()
model.add(Dense(X_train.shape[1], activation='relu'))
model.add(Dense(32, activation='tanh'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='tanh'))
model.add(Dropout(0.2))
model.add(Dense(128, activation='tanh'))
# model.add(Dropout(0.2))
model.add(Dense(512, activation='tanh'))
model.add(Dropout(0.1))
model.add(Dense(1))
model.compile(optimizer=Adam(0.00001), loss='mse')
r = model.fit(X_train, y_train,
              validation_data=(X_test,y_test),
              batch_size=1,
              epochs=100)
Cambridge Industries maintains a database of its sales. All information related to customer choices, such as loyalty cards, product exchanges and the type of product bought, is available in the database. We performed EDA, followed by data cleaning and feature engineering. Statistical analysis shows no significant difference between the offline and online sales offered by the company. We took subsets of the data based on different products and locations, and a map showed where products are bought and delivered. We used different machine learning models (regression, classification, clustering and a deep neural network) to predict, respectively, Sales (company revenue), Inventory_status, SeasonName and Price.